-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: incorrect behaviour setting NOT NULL on column and dropping it #62167
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. This didn't end up being too bad.
Reviewed 2 of 5 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @fqazi)
pkg/sql/schema_changer.go, line 1049 at r1 (raw file):
var committedMutations []descpb.DescriptorMutation var scTable *tabledesc.Mutable
How do you feel about just computing the set of jobs inside the txn and stashing that here as opposed to these constituent pieces?
I've become mildly allergic to descriptors retained outside of transactions. It's too easy to keep them around and then forget that properties it has or doesn't have.
pkg/sql/schema_changer.go, line 2567 at r1 (raw file):
// by a later operation. The nextMutationIdx provides the index at which to check for // later mutation. func isCurrentMutationDiscarded(
Are there other interactions which are interesting? What about indexes being built? FK constraints?
pkg/sql/schema_changer_test.go, line 7040 at r1 (raw file):
defer sqltestutils.SetTestJobsAdoptInterval()() ctx := context.Background() hookOnJobResume := true
I have a feeling you're going to need some synchronization on this.
pkg/sql/schema_changer_test.go, line 7066 at r1 (raw file):
var schemaChangeWaitGroup sync.WaitGroup // Sub test 1: with concurrent drop and mutations
nit: rework this into actual subtests where you take the top part and rework it into a setup function.
pkg/sql/schema_changer_test.go, line 7076 at r1 (raw file):
defer schemaChangeWaitGroup.Done() nextStep <- 1 conn2.Exec(t,
turns out you can't use the sqlrunner in another goroutine. You have to either call t.Error or pass the error back to the main goroutine on a channel.
pkg/sql/schema_changer_test.go, line 7081 at r1 (raw file):
ALTER TABLE t ALTER COLUMN j SET NOT NULL; ALTER TABLE t ADD COLUMN k INT8 DEFAULT 42 NOT NULL UNIQUE; COMMIT TRANSACTION;
TIL I don't know that I've seen COMMIT TRANSACTION
spelled out before
pkg/sql/schema_changer_test.go, line 7106 at r1 (raw file):
// Allow both backfill jobs to proceed proceedBeforeBackfill <- nil proceedBeforeBackfill <- nil
Just FYI I don't think the backfills will actually be concurrent. Did you see
cockroach/pkg/sql/schema_changer.go
Lines 564 to 567 in 7444c8a
// Check that we aren't queued behind another schema changer. | |
if err := sc.notFirstInLine(ctx, desc); err != nil { | |
return err | |
} |
pkg/sql/schema_changer_test.go, line 7170 at r1 (raw file):
Quoted 5 lines of code…
// Allow the first backfill to succeed and the // second one with the drop to fail, which should // cause both to return errors. proceedBeforeBackfill <- nil proceedBeforeBackfill <- errors.Newf("Bogus error for drop column transaction")
There's no ordering on when concurrent channel receives happen. You may also want to consider reworking when you receive notification so that you can know definitively which job you're unblocking. Alternatively you could plumb more info into the knob and then pass it along in a struct on the chan. You may want a schema where you do something like:
backfillEventChan := make(chan chan error)
//...
RunBeforeBackfill: func() error {
errCh := make(chan error)
backfillEventChan <- errCh
return <-errCh
},
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/schema_changer.go, line 1049 at r1 (raw file):
Previously, ajwerner wrote…
var committedMutations []descpb.DescriptorMutation var scTable *tabledesc.Mutable
How do you feel about just computing the set of jobs inside the txn and stashing that here as opposed to these constituent pieces?
I've become mildly allergic to descriptors retained outside of transactions. It's too easy to keep them around and then forget that properties it has or doesn't have.
Done
pkg/sql/schema_changer_test.go, line 7040 at r1 (raw file):
Previously, ajwerner wrote…
I have a feeling you're going to need some synchronization on this.
Done.
pkg/sql/schema_changer_test.go, line 7076 at r1 (raw file):
Previously, ajwerner wrote…
turns out you can't use the sqlrunner in another goroutine. You have to either call t.Error or pass the error back to the main goroutine on a channel.
Done.
pkg/sql/schema_changer_test.go, line 7081 at r1 (raw file):
Previously, ajwerner wrote…
TIL I don't know that I've seen
COMMIT TRANSACTION
spelled out before
Came from sqlfumpt, cleaned it up :)
pkg/sql/schema_changer_test.go, line 7170 at r1 (raw file):
Previously, ajwerner wrote…
// Allow the first backfill to succeed and the // second one with the drop to fail, which should // cause both to return errors. proceedBeforeBackfill <- nil proceedBeforeBackfill <- errors.Newf("Bogus error for drop column transaction")
There's no ordering on when concurrent channel receives happen. You may also want to consider reworking when you receive notification so that you can know definitively which job you're unblocking. Alternatively you could plumb more info into the knob and then pass it along in a struct on the chan. You may want a schema where you do something like:
backfillEventChan := make(chan chan error) //...
RunBeforeBackfill: func() error { errCh := make(chan error) backfillEventChan <- errCh return <-errCh },
Done.
Thanks, @ajwerner. I reworked the test internals to be saner. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you've got some linter things to fix but
Reviewed 1 of 5 files at r1, 1 of 2 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)
pkg/sql/schema_changer.go, line 1049 at r2 (raw file):
var didUpdate bool var depMutationJobs []jobspb.JobID var scTable *tabledesc.Mutable
I don't think we need to declare scTable outside the txn anymore
pkg/sql/schema_changer.go, line 1547 at r2 (raw file):
// the rollback completed the job was adopted. // the rollback completed the job was adopted.
nit: duplicated line
Fixes: cockroachdb#47719 Previously, if a column was mutated and then dropped in a single transaction we would still try to validate the dropped column. Additionally, in general if a drop operation occurred in a different transaction we ran the danger of doing something similar. From a behaviour perspective this was bad, since it can lead to unexpected behaviour for users. To address, this patch introduces logic to scan later mutations to check if a drop column occurs. If one does occur then it will skip any operations made irrelevant by the drop. Additionally, for a correctness perspective it will wait for the drop to complete before returning back. Release note (bug fix): A constraint like not null or check on a column made irrelevant by a drop in a later concurrent transaction would lead to incorrect errors / behaviour.
bors r=ajwerner |
Build succeeded: |
Fixes: #47719
Previously, if a column was mutated and then dropped in a single
transaction we would still try to validate the dropped column.
Additionally, in general if a drop operation occurred in a
different transaction we ran the danger of doing something
similar. From a behaviour perspective this was bad, since
it can lead to unexpected behaviour for users. To address, this
patch introduces logic to scan later mutations to check if
a drop column occurs. If one does occur then it will skip any
operations made irrelevant by the drop. Additionally, for a
correctness perspective it will wait for the drop to complete
before returning back.
Release note (bug fix): A constraint like not null or check on
a column made irrelevant by a drop in a later concurrent transaction
would lead to incorrect errors / behaviour.